ISIP 2000 Conversational Speech Evaluation System

Authors

R. Sundaram

A. Ganapathiraju

J. Hamaker

Abstract

In this paper, we describe the ISIP Automatic Speech Recognition system (ISIP-ASR) used for the Hub-5 2000 English evaluations. The system is a public-domain, cross-word, context-dependent, HMM-based system and has all the functionality normally expected in an LVCSR system, including Baum-Welch training for continuous-density HMMs, phonetic decision-tree-based state tying, word graph generation, and rescoring. The acoustic models were trained on 60 hours of Switchboard data and 20 hours of CallHome data. The system had a word error rate of 43.4% on Switchboard and 54.8% on CallHome, for an overall error rate of 49.1%. This paper describes the evaluation system in detail and discusses our post-evaluation experiments and improvements.
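The word error rates quoted above follow the usual definition: substitutions, deletions, and insertions in the best word alignment, divided by the number of reference words. The abstract does not say which scoring tool was used, so the following is only a minimal sketch of that definition; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; not the scoring code used in
    the evaluation. Words are obtained by splitting on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.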

Similar resources

The ISIP Public Domain Decoder for Large Vocabulary Conversational Speech Recognition

The BBN Byblos 2000 conversational Mandarin LVCSR system

This paper describes the year 2000 BBN Byblos Mandarin large vocabulary conversational speech recognition (LVCSR) system, the winning (and only) Mandarin system from the Spring 2000 Hub-5 evaluation sponsored by NIST. We first outline the training and decoding procedures used in the system, and describe the performance of the system used in the evaluation. We then describe the effect of several...

Recognizing Call-center Speech Using Models Trained from Other Domains

In this paper, we introduce a new conversational speech task – recognizing call-center speech – using data collected from Dragon’s own technical support line. We compare performance of models trained from conversational telephone speech (the Switchboard corpus) and models trained from predominantly read, microphone speech, and report on a series of experiments focusing on adapting the microphon...

Rate-of-speech Modeling for Large Vocabulary Conversational Speech Recognition

Variations in rate of speech (ROS) produce changes in both spectral features and word pronunciations that affect automatic speech recognition (ASR) systems. To deal with these ROS effects, we propose to use parallel, rate-specific, acoustic models: one for fast speech, the other for slow speech. Rate switching is permitted at word boundaries, to allow modeling within-sentence speech rate variat...
